Helping Novices Avoid the Hazards of Data: Leveraging Ontologies to Improve Model Generalization Automatically with Online Data Sources
Authors
Abstract
ods produce a model from a set of examples. Despite the maturity of these algorithms, decisions that result from models are unlikely to be correct if data have been used indiscriminately. This is part of the so-called data-dredging problem (Smith and Shah 2002). Figure 1 shows a bemusing example: U.S. spending per annum on science, space, and technology is highly correlated with suicides by hanging, strangulation, and suffocation. Logically, we know that this correlation does not imply causation (that is, higher spending on technology cannot cause more suicides by hanging or vice versa). Unfortunately, without the capacity to distinguish real and spurious correlations, learning methods are prone to picking up such correlations in producing models (Tukey 1977). The onus to be judicious ultimately falls on the person building the model. Otherwise, data dredging may lead to specious models, which may overfit, generalize poorly, and suggest conclusions that are fallacious.
Similar resources
Leveraging Ontologies to Improve Model Generalization Automatically with Online Data Sources
This paper describes an end-to-end learning framework that allows a novice to create a model from data easily by helping structure the model building process and capturing extended aspects of domain knowledge. By treating the whole modeling process interactively and exploiting high-level knowledge in the form of an ontology, the framework is able to aid the user in a number of ways, including i...
Innovative Applications of Artificial Intelligence 2015
The AAAI Conference on Innovative Applications of Artificial Intelligence (IAAI) was founded in 1989 to showcase the successful application of artificial intelligence technology to real-world problems and its deployment into the hands of end users. Since then, we have seen examples of AI applied to domains as varied as medicine, education, manufacturing, transportation, user model...
Pitfalls in Ontologies and TIPS to Prevent Them
A growing number of ontologies are already available thanks to development initiatives in many different fields. In such ontology developments, developers must tackle a wide range of difficulties and handicaps, which can result in the appearance of anomalies in the resulting ontologies. Therefore, ontology evaluation plays a key role in ontology development. OOPS! is an on-line tool that automa...
Ontological Assistance for Knowledge Discovery in Databases Process
The dramatic explosion of data and the growing number of different data sources are exposing researchers to a new challenge: how to acquire, maintain, and share knowledge from large databases in the context of rapidly applied and evolving research. This paper describes research on an ontological approach for leveraging the semantic content of ontologies to improve knowledge discovery in dat...
The Current Landscape of Pitfalls in Ontologies
A growing number of ontologies are already available thanks to development initiatives in many different fields. In such ontology developments, developers must tackle a wide range of difficulties and handicaps, which can result in the appearance of anomalies in the resulting ontologies. Therefore, ontology evaluation plays a key role in ontology development projects. OOPS! is an on-line tool th...
Journal title:
- AI Magazine
Volume 37, Issue
Pages -
Publication date 2016